CERTIFICATE OF MAILING BY "EXPRESS MAIL" 

"Express Mail" mailing label number EK71 92361 35US 

Date of Deposit: 

I hereby certify that this paper or fee is being deposited with 
the United States Postal Service "Express Mail Post Office to 
Addressee" service under 37 CFR 1.10 on the date inscribed above 
and is addressed to the Assistant Commissioner of Patents, Box 
PATENT APPLICATION, Washington, D.C. 20231 . 

JOAN PENNINGTON 
(Typed or printed name of person mailing paper a 





METHOD AND APPARATUS FOR PROVIDING LOCATION-SPECIFIC 
RESPONSES IN AN AUTOMATED VOICE RESPONSE SYSTEM 



Field of the Invention 

The present invention relates generally to the data processing field, 
and more particularly, relates to a method, apparatus and computer program 
product for providing location-specific responses in an automated voice 
response system. 

Description of the Related Art 

Systems capable of performing speech recognition are known in the 
prior art. For example, known systems respond to a spoken word by 
producing the textual spelling, or some other symbolic output, associated 
with that word. 

The automatic recognition of spoken speech can be used for many 
applications. For example, a voice recognition system may be used for 
controlling a plurality of different devices. 

A need exists for an automated, flexible and efficient voice response 
system. It is desirable to provide such an automated, flexible and efficient 
voice response system for controlling a plurality of different devices. It is
desirable to provide such an automated, flexible and efficient voice response 
system including location-specific responses for controlling a plurality of 
different devices.

Summary of the Invention



A principal object of the present invention is to provide a method, 
apparatus and computer program product for providing location-specific 
responses in an automated voice response system. Other important objects 
of the present invention are to provide such method, apparatus and 
computer program product for providing location-specific responses in an 
automated voice response system that efficiently and effectively facilitates a 
determination of an intent of a spoken command; to provide such method, 
apparatus and computer program product substantially without negative 
effect; and that overcome many of the disadvantages of prior art 
arrangements. 

In brief, a method, apparatus and computer program product are 
provided for providing location-specific responses in an automated voice 
response system. A microphone signal is received from each of a plurality of 
microphones. The microphones are located within a defined environment. 
A spoken command is identified utilizing voice recognition responsive to the 
received microphone signals. A sound origin or sound location vector is 
identified responsive to each identified spoken command from respective 
ones of the plurality of microphones. A response command is provided 
based upon the identified sound location vector. 

Brief Description of the Drawings 

The present invention together with the above and other objects and 
advantages may best be understood from the following detailed description 
of the preferred embodiments of the invention illustrated in the drawings, 
wherein: 

FIG. 1 is a block diagram representation illustrating a processor 
automated voice response system for implementing location-specific 
responses in accordance with the preferred embodiment; 

FIG. 2 is a more detailed diagram illustrating the automated voice 
response system for implementing location-specific responses of FIG. 1 in 
accordance with the preferred embodiment;

FIGS. 3 and 4 are diagrams illustrating exemplary details of the digital
analysis unit of the automated voice response system for implementing 
location-specific responses in accordance with the preferred embodiment; 

FIG. 5 is a flow chart illustrating exemplary sequential steps for 
implementing location-specific responses in an automated voice response
system in accordance with the preferred embodiment; and 

FIG. 6 is a block diagram illustrating a computer program product in 
accordance with the preferred embodiment. 

Detailed Description of the Preferred Embodiments 

Having reference now to the drawings, in FIG. 1, there is shown an
automated voice response system of the preferred embodiment generally
designated by the reference character 100. As shown in FIG. 1, automated
voice response system 100 includes a processor or central processor unit
(CPU) 102. CPU 102 is adapted for selectively controlling at least one of a
plurality of different devices 1-3, 104 responsive to an identified spoken
command indicated by a block labeled SOUND 110. A user interface (UI) 200
connects the CPU 102 to a plurality of microphones 1-N, 114 located within
an environment 116 wired with the microphones. User interface (UI) 200
also operatively couples the CPU 102 to the plurality of different devices 1-3,
104 to selectively provide predefined controlled operations of the devices
104. The automated voice response system 100 includes a memory 120
storing a location-specific response program 122 of the preferred
embodiment and a plurality of predefined response commands 124 issued
by CPU 102 for operatively controlling the devices 1-3, 104.

Central processor unit 102 is suitably programmed to execute the flow
chart of FIG. 5 of the preferred embodiment for implementing
location-specific responses of the preferred embodiment. The processor
automated voice response system 100 may be implemented using any suitable
processor system, or computer, such as an IBM personal computer running
the OS/2® operating system.



In accordance with features of the invention, the automated voice
response system 100 processes a sound input from the microphones 1-N,
114, performing voice recognition to identify spoken commands and signal
analysis to identify the location of a sound's origin within environment 116.
The identified physical location of the person uttering a spoken command is
used as a discriminating criterion by the automated voice response system
100 to select one of the stored automated response commands 124 for
controlling different devices 1-3, 104.

Referring now to FIG. 2, the automated voice response system 100 
including user interface 200 is shown in more detail. User interface 200 
includes a respective analog-to-digital converter (ADC) 204 coupled to each 
of the microphones 1-N, 114. ADC 204 receives and digitizes an analog
audio signal from its associated microphone 114 and applies the digitized 
audio signal to a clock adder 206. A synchronized time signal is added by 
the clock adder 206 to the digitized audio signal and then applied to both a 
respective voice recognition unit (VRU) 208 and a respective channel input 
of a digital analysis unit 300. Digital analysis unit 300 includes a respective 
digital buffer 210 and a signal analysis buffer 212 for each respective 
channel input 1-N corresponding to digitized, clock added signals for the 
microphones 1-N, 114. A command status word (CSW) register 216 is
connected to each VRU 208 and to the CPU 102. When a particular VRU 
208 identifies a spoken command, a bit corresponding to the particular VRU 
208 is set in the CSW 216. CPU 102 polls the CSW 216. When the CPU
102 detects that a bit has been set in the CSW 216, CPU 102 interrogates 
the corresponding VRU 208 for a command ID (CID), a start time of the
command T0, and a length of the command as a measure of time. Upon
receiving the command information, CPU 102 signals the digital analysis unit 
300 via a snap block 218 and an analyze block 220 to analyze the identified 
spoken command signal. Digital analysis unit 300 returns a location vector 
to the CPU 102 indicated at a line labeled LOCATION. User interface 200 
includes a respective digital-to-analog converter (DAC) 222 coupled between 
CPU 102 and each of the different devices 104 (one shown in FIG. 2). 
Responsive to the location signal provided by the digital analysis unit 300, 
CPU 102 then applies a location-specific response for selectively controlling 
at least one of a plurality of different devices 1-3, 104. 
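
The polling sequence described above can be sketched as follows. This is an illustration only, assuming that the CSW is modeled as an integer bit mask in which bit n corresponds to VRU channel n. The names CommandInfo and poll_csw, and the sample values, are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CommandInfo:
    """Command information a VRU reports when interrogated (hypothetical layout)."""
    cid: int           # command ID (CID)
    start_time: float  # command start time T0, in seconds
    length: float      # command length, in seconds


def poll_csw(csw: int, vru_results: Dict[int, CommandInfo]) -> List[Tuple[int, CommandInfo]]:
    """Return (channel, CommandInfo) for every VRU whose bit is set in the CSW.

    Assumption: bit n of the CSW corresponds to VRU channel n.
    """
    found = []
    channel = 0
    while csw >> channel:  # stop once no higher bits remain set
        if (csw >> channel) & 1:
            found.append((channel, vru_results[channel]))
        channel += 1
    return found


# VRUs 0 and 2 recognized the same command and set their bits.
csw = 0b101
results = {
    0: CommandInfo(cid=7, start_time=12.50, length=1.2),
    2: CommandInfo(cid=7, start_time=12.51, length=1.2),
}
for channel, info in poll_csw(csw, results):
    print(channel, info.cid, info.start_time, info.length)
csw = 0  # the CPU clears the CSW once the command has been handled
```

In this sketch the command information gathered for each flagged channel is what would be handed to the digital analysis unit 300 for location analysis.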



FIG. 3 illustrates an exemplary digital analysis unit 300A receiving
channel inputs 1-N. CPU 102 provides a locate sound input including
channel #, the command start time T0, and the command length to the
digital analysis unit 300A. Digital analysis unit 300A provides a location
vector (X1, X2, X3, ... Xn) of the origin of sound 110 in the environment 116
that is applied to the CPU 102.

FIG. 4 illustrates another exemplary digital analysis unit 300B
receiving channel inputs 1-N respectively coupled to a corresponding first-in
first-out (FIFO) digital buffer 402. CPU 102 provides a locate sound input
including channel #, the command start time T0, and the command length
to a frame snap (FS) function 404 in the digital analysis unit 300B. An
analysis buffer 408 is coupled to FIFO digital buffers 402 via the FS function
404. FS function 404 captures a region from the FIFO digital buffers 402
into the analysis buffer 408 for phase-relation analysis, performed by a
locator function 410. Locator function 410 operates on the captured region
from the FIFO digital buffers 402 in analysis buffer 408, extracting salient
signal features, and determining the phase shift and volumes of input
frequencies from respective microphones 114, thereby locating the origin of
sound 110 in the environment 116. Digital analysis unit 300B provides a
location vector (X1, X2, X3, ... Xn) that is applied to the CPU 102.
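
One way a locator function might compare the captured regions of two channels is sketched below. The specification does not prescribe an algorithm. This sketch assumes a plain time-domain cross-correlation to estimate the arrival delay between two microphones and a root-mean-square level as the volume measure. The function names best_lag and rms and the sample data are hypothetical.

```python
from typing import Sequence


def best_lag(ref: Sequence[float], other: Sequence[float], max_lag: int) -> int:
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means the sound reached `other` later than `ref`.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]  # correlation of ref with shifted other
        if score > best_score:
            best, best_score = lag, score
    return best


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level, used here as a simple volume measure."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5


# Captured region for two channels: the same pulse, arriving 3 samples later
# and attenuated on the second microphone.
pulse = [0.0, 1.0, 2.0, 1.0, 0.0]
mic_a = [0.0] * 4 + pulse + [0.0] * 7
mic_b = [0.0] * 7 + [0.5 * p for p in pulse] + [0.0] * 4

print(best_lag(mic_a, mic_b, max_lag=5))  # 3
print(rms(mic_a) > rms(mic_b))            # True
```

A later arrival and a lower level on the second channel both suggest that the speaker is closer to the first microphone. Combining such pairwise comparisons across all channels 1-N is one plausible route to a location vector.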

Referring now to FIG. 5, there are shown exemplary sequential steps 
for implementing location-specific responses in the automated voice 
response system 100 in accordance with the preferred embodiment. The 
sequential steps begin when a command is spoken as indicated in a block 
500 and sound enters the plurality of microphones 1-N, 114 as indicated in a 
block 502. The microphone signal is digitized and a clock signal is added to
the digitized microphone signal by a respective ADC 204 and the clock
adder 206 as indicated in a block 504. A spoken command is recognized by
one or more VRUs 208 as indicated in a block 506. The spoken command
identified at block 506 is limited to commands that start with a given phrase
or prefix word, such as "computer". Also, the spoken command identified at
block 506 can be limited to commands spoken by a particular person. VRU 
208 advantageously can be adapted to identify a particular person before 
certain spoken commands are processed, for example, in order to implement 
parental control of a particular device 104.

Each VRU 208 recognizing the spoken command at block 506,
(VRUn), stores the command start time T0 and the command length for
the identified command and sets a bit in the command status word (CSW)
216 as indicated in a block 508. CPU 102 detects the bit in the command
status word (CSW) 216 and retrieves the command start time T0 and the
command length for the identified command from the respective VRUn as
indicated in a block 510. CPU 102 passes the VRU channel number n, the
command start time T0, and the command length for the identified spoken
command to the digital analysis unit (DAU) 300 as indicated in a block 512.
DAU 300 analyzes the sound for each identified spoken command, taking
key information from each VRU channel number n, and determines a sound
location vector as indicated in a block 514.

DAU 300 analyzes the sound signal for each identified spoken 
command of each VRU channel number, for example, by comparing phases 
and/or volumes of input frequencies to locate the sound origin in space. 
DAU 300 returns the sound location vector (X1, X2, X3, ... Xn) of the origin of
sound 110 in the environment 116 to the CPU 102 as indicated in a block
516. CPU 102 uses the sound location vector (X1, X2, X3, ... Xn) of the origin
of sound 110 in the environment 116 to determine, for example, the validity,
applicability, and intent of the spoken command. CPU 102 applies a
particular command to controlled device 104 based upon the sound location
vector (X1, X2, X3, ... Xn) as indicated in a block 518. Then CPU 102 clears
the CSW 216 as indicated in a block 520. Then the sequential steps return 
to block 500 following entry point A for processing a next spoken command. 
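
The use of the sound location vector at block 518 can be illustrated with a short sketch. It assumes a two-dimensional location vector and a table of device positions. The table DEVICE_POSITIONS, the coordinates, and the function names are hypothetical and are given only to show how a location could resolve the intent of a command such as "lock this door".

```python
import math
from typing import Dict, Sequence, Tuple

# Hypothetical device positions in the environment, in metres.
DEVICE_POSITIONS: Dict[str, Tuple[float, float]] = {
    "front door": (0.0, 0.0),
    "back door": (10.0, 8.0),
    "garage door": (0.0, 8.0),
}


def nearest_device(location: Sequence[float]) -> str:
    """Pick the device closest to the sound location vector."""
    return min(
        DEVICE_POSITIONS,
        key=lambda name: math.dist(location, DEVICE_POSITIONS[name]),
    )


def response_command(command: str, location: Sequence[float]) -> str:
    """Map a recognized command plus a location vector to a response command."""
    if command == "lock this door":
        return f"lock {nearest_device(location)}"
    return command  # no location-specific handling in this sketch


print(response_command("lock this door", (9.0, 7.5)))  # lock back door
```

The same spoken command yields a different response command when it is uttered near a different door, which is the discriminating role the location vector plays in the steps of FIG. 5.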

It should be understood that many variations of the exemplary steps
performed by the automated voice response system 100 can be provided.
One variation would be to perform the location analysis only when the
identified spoken command indicates that the location analysis is necessary.
For example, the spoken command, "computer, lock up the house" would
have no locational component, while the spoken command, "computer, lock
this door" would have a locational component. Another variation would
screen commands that originated from certain fixed locations, such as stereo
speakers or intercoms, so that the location analysis would not be
performed. Also, the automated voice response system 100 can be arranged
to process the microphone signal from only the one VRU 208 that was passed
the loudest signal from the array of microphone inputs.

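Two of the variations described above can be sketched briefly. This illustration assumes that a locational component is detected by looking for certain words in the recognized command text and that each channel reports a signal level. The word list LOCATIONAL_WORDS and the function names are hypothetical.

```python
from typing import Dict

# Hypothetical words that mark a command as needing location analysis.
LOCATIONAL_WORDS = {"this", "that", "here"}


def needs_location(command: str) -> bool:
    """True when the command text refers to the speaker's surroundings."""
    words = command.lower().replace(",", " ").split()
    return any(word in LOCATIONAL_WORDS for word in words)


def loudest_channel(levels: Dict[int, float]) -> int:
    """Return the channel number whose microphone signal is loudest."""
    return max(levels, key=levels.get)


print(needs_location("computer, lock up the house"))    # False
print(needs_location("computer, lock this door"))       # True
print(loudest_channel({1: 0.2, 2: 0.9, 3: 0.4}))        # 2
```

Skipping the location analysis for commands without a locational component, or processing only the loudest channel, reduces the work performed for each spoken command.
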
Referring now to FIG. 6, an article of manufacture or a computer 
program product 600 of the invention is illustrated. The computer program 
product 600 includes a recording medium 602, such as a floppy disk, a high
capacity read only memory in the form of an optically read compact disk or
CD-ROM, a tape, a transmission type media such as a digital or analog
communications link, or a similar computer program product. Recording
medium 602 stores program means 604, 606, 608, 610 on the medium 602
for carrying out the methods for implementing location-specific responses in
the system 100 of FIG. 1.

A sequence of program instructions or a logical assembly of one or 
more interrelated modules defined by the recorded program means 604, 
606, 608, 610 directs the automated voice response system 100 in
implementing location-specific responses of the preferred embodiment.

While the present invention has been described with reference to the
details of the embodiments of the invention shown in the drawing, these
details are not intended to limit the scope of the invention as claimed in the
appended claims.